Skip to content

Fix ICE when data section exceeds 2^12 words#7643

Open
Dnreikronos wants to merge 4 commits into
FuelLabs:masterfrom
Dnreikronos:fix/large-data-section-ice
Open

Fix ICE when data section exceeds 2^12 words#7643
Dnreikronos wants to merge 4 commits into
FuelLabs:masterfrom
Dnreikronos:fix/large-data-section-ice

Conversation

@Dnreikronos
Copy link
Copy Markdown
Contributor

Summary

Closes #7612.

  • Root cause: realize_load() panicked for copy-type data section offsets exceeding 12-bit LW/LB immediate range (4095 words / 32KB). Any program with a data section larger than this ICE'd during codegen.
  • Fix: Three-tier offset strategy for copy-type loads:
    • ≤12 bits → single LW/LB with immediate offset (unchanged behavior)
    • >12 bits, ≤18 bitsMOVI + ADD $ds + LW/LB (3 instructions, uses dest register as scratch)
    • >18 bits → panic with clear message (data section ~256KB limit)
  • Size estimation: Updated op_size_in_bytes (finalized_asm.rs) and instruction_size_not_far_jump (allocated_abstract_instruction_set.rs) to return correct instruction counts for both paths, so jump offsets remain aligned.
  • Helper: Added DataSection::size_in_bytes() for pre-pointer-insertion size snapshots.
  • Regression test: large_data_section e2e test creates 4200+ distinct u64 entries (values >262143 to bypass MOVI inlining), verifying correct codegen via checksum.

Test plan

  • cargo check -p sway-core passes
  • large_data_section test passes — exercises the >12-bit copy-type offset path
  • All 333 should_pass/language e2e tests pass with 0 failures
  • CI full suite

LW/LB instructions have 12-bit immediate offsets (max 4095). When
copy-type loads exceeded this, realize_load() panicked. Replace
the panic with a three-tier approach: <=12-bit uses single LW/LB,
>12-bit uses MOVI+ADD+LW/LB (3 instructions), >18-bit panics with
a clear message. Update op_size_in_bytes and
instruction_size_not_far_jump to return matching instruction counts
so jump offsets remain correct.
Exercises the >12-bit offset path by creating 4200+ distinct u64
data section entries (values >262143 to avoid MOVI inlining).
Verifies correct codegen via checksum of all loaded values.
@Dnreikronos Dnreikronos requested a review from a team as a code owner May 27, 2026 12:09
@fuel-cla-bot
Copy link
Copy Markdown

fuel-cla-bot Bot commented May 27, 2026

Thanks for the contribution! Before we can merge this, we need @Dnreikronos to sign the Fuel Labs Contributor License Agreement.

@cursor
Copy link
Copy Markdown

cursor Bot commented May 27, 2026

PR Summary

Medium Risk
Touches core Fuel bytecode emission, pointer pre-insertion, and instruction size estimation used for jumps; wrong sizing would miscompile, but scope is localized to data loads and layout.

Overview
Fixes Fuel codegen ICE when static data section offsets exceed the 12-bit immediate range of LW/LB (roughly >32KB / >4096 words).

realize_load now emits a single immediate load when the offset fits in 12 bits, otherwise MOVI + ADD $ds + LW/LB (18-bit byte offset cap with an assert). addr_of gets the same 18-bit guard.

Instruction sizing in to_bytecode_mut and instruction_size_not_far_jump accounts for the expanded copy-type and non-copy (LoadDataId) sequences, including a worst-case pointer word offset so jump/layout math stays consistent when appended pointer slots also need the large load form.

Data section pointers for non-copy loads are keyed by (source DataId, load_site_offset) via source_to_pointer and pointer_for_load_site; pre-insertion uses predicted pointer offsets and non_configurables_size_in_bytes() before pointers are appended.

Adds e2e large_data_section (4200+ distinct constants, checksum) to exercise the large-offset path.

Reviewed by Cursor Bugbot for commit 6888acd. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread sway-core/src/asm_generation/finalized_asm.rs
Comment thread sway-core/src/asm_generation/finalized_asm.rs Outdated
The hardcoded -4 byte $pc correction assumed the inner copy-type load
always emits 1 instruction. When the pointer entry's word offset exceeds
12 bits (data section >32KB), the inner load emits 3 instructions
(MOVI+ADD+LW/LB), making the ADD $pc land 8 bytes further than the
stored pointer value accounts for.

Three fixes:
- Dynamic $pc correction based on predicted pointer entry position
- Pointer lookup keyed by (source DataId, instruction offset) instead of
  pointer value, preventing collisions when the same non-copy entry is
  loaded at multiple sites
- Worst-case pointer word offset for size estimates, using
  non_configurables_size_in_bytes instead of total section size to avoid
  configurable entries inflating the threshold
Comment thread sway-core/src/asm_generation/finalized_asm.rs
Comment thread sway-core/src/asm_generation/finalized_asm.rs
@Dnreikronos
Copy link
Copy Markdown
Contributor Author

bugbot run

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 6888acd. Configure here.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jun 4, 2026

Merging this PR will not alter performance

✅ 25 untouched benchmarks


Comparing Dnreikronos:fix/large-data-section-ice (6888acd) with master (ac933ff)

Open in CodSpeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Many assert_eq calls make compiler panic with too big data section

1 participant